1. The proposed method for correcting the total
observation and forecast error variance assumes
that the “observed-minus-forecast” residuals
(O-F) are normally distributed, where O is an
observed value and F is usually a short-term
model forecast. This assumption is acceptable
for several types of observations (except
humidity) that are not grossly in error
(Andersson and Jarvinen 1999, Dharssi et al.
1992, Hollingsworth et al. 1986, Jarvinen and
Unden 1997, Lorenc and Hammon 1988).

The degree of nearness to the normal
distribution can be estimated by the skewness
(lack of symmetry)

$a_3 = \mu_3 / \sigma^3$

and the kurtosis

$a_4 = \mu_4 / \sigma^4 - 3$

Here $\mu_i$ is the central moment of order $i$
and $\sigma$ is the standard deviation. It is
well known that for the normal distribution
$a_3 = a_4 = 0$.
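
The two statistics above can be estimated from a
sample of O-F residuals. The following is a
minimal sketch, assuming the residuals are held
in a plain Python list and that population
(biased) moment estimates are acceptable; the
function name `skewness_and_kurtosis` is
hypothetical and does not come from the paper.

```python
def skewness_and_kurtosis(residuals):
    """Return (a3, a4): skewness and excess kurtosis of a sample.

    Uses central moments mu_i = mean((x - mean)^i), so that
    a3 = mu_3 / sigma^3 and a4 = mu_4 / sigma^4 - 3.
    """
    n = len(residuals)
    if n == 0:
        raise ValueError("need at least one residual")
    mean = sum(residuals) / n
    # Central moments of order 2, 3 and 4.
    mu2 = sum((x - mean) ** 2 for x in residuals) / n
    mu3 = sum((x - mean) ** 3 for x in residuals) / n
    mu4 = sum((x - mean) ** 4 for x in residuals) / n
    if mu2 == 0.0:
        raise ValueError("sample has zero variance")
    a3 = mu3 / mu2 ** 1.5        # sigma^3 = mu2^(3/2)
    a4 = mu4 / mu2 ** 2 - 3.0    # sigma^4 = mu2^2
    return a3, a4
```

Values of both statistics close to zero are
consistent with a normal distribution; a large
positive kurtosis points to heavier tails than
the Gaussian.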

Table 1 contains $a_3$ and $a_4$ for the O-F's
of several types of observations: rawinsonde
heights, winds and mixing ratio, aircraft
winds, cloud-track winds, and surface heights
(recast as upper air) and winds. Six-hour
model forecasts (F) were obtained from the
Goddard Earth Observing System 4.0.3
assimilation run for October 1-31, 2003.

Figs. 1-7 show the O-F histograms for these
observations (without gross errors).


Distributions of O-F’s corresponding to 
rawinsonde heights and winds, aircraft 
winds, cloudtrack winds, surface heights and 
winds are close to the normal distribution. 
The rawinsonde mixing ratio distribution
cannot be considered normal; its kurtosis
value is very large.

2. If a random variable X has a normal
distribution with zero mean and standard
deviation $\sigma$, then the probability that X
lies within the interval $(0, a\sigma)$ is:

$P(0 < X < a\sigma) = F(a) - F(0) = F(a) - 0.5$    (1)

Here

$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$,

$\sigma$ is the standard deviation of X, $a$ is
an arbitrary constant, and $F$ is the standard
normal distribution function.
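
As an illustration of (1), the standard normal
distribution function can be evaluated with the
error function instead of a printed table. This
is a small sketch; the helper names
`std_normal_cdf` and `prob_within` are
hypothetical, and X is assumed to have zero
mean.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function F(x), via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_within(a):
    """P(0 < X < a*sigma) for zero-mean normal X, as in (1)."""
    return std_normal_cdf(a) - 0.5
```

By symmetry, the probability of exceeding
$a\sigma$ in absolute value is
`1 - 2 * prob_within(a)`.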

Suppose we know the number (percentage) of
observations within some interval
$(0, a\sigma)$.

If this number does not agree with (1), it may
mean that the $\sigma$ we are using is not the
true standard deviation for these observations.

However, this number can be used to correct
the standard deviation.


Table 1. The skewness $a_3$ and the kurtosis $a_4$.

        rawinhght  rawinwind  rawinhumd  aircrftwind  cldtrkwind  srfheight  srfwind
$a_3$      0.13      -0.01      -0.08        0.01       -0.57       0.02      0.04
$a_4$      0.87      -1.12       6.08       -0.67        0.41      -1.07     -1.74

For this purpose it is convenient to apply the
so-called “background check” procedure, which
is often used as part of quality control in
data assimilation systems (Andersson and
Jarvinen 1999, Dee et al. 1999).

The background check examines all observations
against the short-range (6-hour) forecasts.
Specifically, the following inequality is
verified for each observation:

$(O-F)^2 < a^2 \{ (\hat{\sigma}^o)^2 + (\hat{\sigma}^f)^2 \}$    (2)

where $a$ is a tolerance parameter, and
$\hat{\sigma}^o$ and $\hat{\sigma}^f$ are the
prescribed observation and forecast error
standard deviations, respectively. If
inequality (2) is not fulfilled for some
observation, the observation is marked as
suspect. The prescribed values $\hat{\sigma}^o$
and $\hat{\sigma}^f$ represent the true
observation and forecast error standard
deviations only approximately.
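
The check in (2) can be sketched in a few lines.
This is an illustrative sketch only, not the
operational quality-control code: the function
names and the use of a single pair of
prescribed standard deviations for all
observations are assumptions made here for
brevity.

```python
def background_check(o_minus_f, sigma_o_hat, sigma_f_hat, a):
    """Flag each O-F residual that fails inequality (2).

    Returns a list of booleans; True means "suspect".
    sigma_o_hat and sigma_f_hat are the prescribed observation
    and forecast error standard deviations, a is the tolerance.
    """
    threshold = a ** 2 * (sigma_o_hat ** 2 + sigma_f_hat ** 2)
    # (2) requires (O-F)^2 < threshold; anything else is suspect.
    return [r ** 2 >= threshold for r in o_minus_f]


def suspect_percentage(flags):
    """Percentage M of observations marked as suspect."""
    if not flags:
        raise ValueError("no observations to check")
    return 100.0 * sum(flags) / len(flags)
```

The percentage returned by `suspect_percentage`
plays the role of M in the derivation that
follows.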

Let $\hat{\sigma} = \{ (\hat{\sigma}^o)^2 + (\hat{\sigma}^f)^2 \}^{1/2}$ and
$\sigma = \{ (\sigma^o)^2 + (\sigma^f)^2 \}^{1/2}$,
where $\sigma^o$ and $\sigma^f$ are the true
standard deviations. Instead of (2) we can
write:

$|O-F| < a (\hat{\sigma}/\sigma) \sigma$

Suppose M is the percentage of suspect
observations obtained by some background
check.

This means:

$P(0 < O-F < a\hat{\sigma}) = 0.5 - M/(2 \cdot 100)$

But according to (1) we have:

$P(0 < O-F < a\hat{\sigma}) = F(a\hat{\sigma}/\sigma) - 0.5$

Then

$F(a\hat{\sigma}/\sigma) = 1 - M/(2 \cdot 100)$    (3)

Using the table of the standard normal
distribution function we can find the value
$a\hat{\sigma}/\sigma = m$ corresponding to
$1 - M/(2 \cdot 100)$. Then we can find

$\sigma = a\hat{\sigma}/m$    (4)
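
Steps (3) and (4) can be sketched with the
inverse of the standard normal distribution
function in place of a table lookup. This is a
minimal sketch, assuming Python 3.8 or later
for `statistics.NormalDist`; the function name
`corrected_sigma` and its argument names are
hypothetical.

```python
from statistics import NormalDist


def corrected_sigma(sigma_hat, a, suspect_percent):
    """Corrected total standard deviation from (3) and (4).

    sigma_hat       -- prescribed total standard deviation
    a               -- tolerance parameter of the background check
    suspect_percent -- percentage M of suspect observations
    """
    if not 0.0 < suspect_percent < 100.0:
        raise ValueError("M must lie strictly between 0 and 100")
    # (3): F(m) = 1 - M/(2*100), so m is the matching quantile.
    m = NormalDist().inv_cdf(1.0 - suspect_percent / 200.0)
    # (4): sigma = a * sigma_hat / m
    return a * sigma_hat / m
```

When the check rejects more observations than
the normal law predicts for the chosen
tolerance, m is smaller than the tolerance and
the corrected value exceeds the prescribed one;
when it rejects fewer, the corrected value is
smaller.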

3. Consider one example. Let $a = 2$. Suppose
the background check gave a suspect
observation percentage of M = 4.6. Then
$1 - M/(2 \cdot 100) = 0.977$.

From (3) and the standard normal
distribution table we have
$2\hat{\sigma}/\sigma = 2$, and
$\sigma = \hat{\sigma}$. That is,
$\hat{\sigma}$ is specified correctly.
This agrees with the known statistical
rule: about 4.6% of observations should lie
beyond $2\sigma$.

4. Conclusions: 

a) Using the results of a background check,
the prescribed statistics
$(\hat{\sigma}^o)^2 + (\hat{\sigma}^f)^2$
can be corrected. Then the background
check can be repeated with the new
$\sigma = \{ (\sigma^o)^2 + (\sigma^f)^2 \}^{1/2}$.

b) Equation (4), together with the results
of an appropriate background check, can be
considered as a relation between the
observation and forecast error standard
deviations. If the true value of $\sigma^o$
is known, we can calculate $\sigma^f$.

c) Consider the results of two background
checks for the same observation type, but for
different instruments. Then using (4) we can
write two equations for the true sigmas:

$(\sigma^o_1)^2 + (\sigma^f)^2 = s_1$

$(\sigma^o_2)^2 + (\sigma^f)^2 = s_2$

Because $\sigma^f$ is the same for both types
of observations, subtracting one equation
from the other gives a relation between the two
observation error standard deviations. If one
of them can be found more easily than the
other, the relation can be used to obtain the
second one from the first. For example, the
observation error standard deviation for
TOVS heights can be found from the
rawinsonde height observation error
standard deviation. Analogously, the
cloud-track wind observation error standard
deviation can be found from the rawinsonde
wind observation error standard deviation.
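
Conclusions b) and c) amount to simple
arithmetic on variances, sketched below. The
sketch assumes that s_1 and s_2 are the
corrected total variances (the squares of the
values given by (4)) for the two instruments;
the function names are hypothetical.

```python
import math


def forecast_sigma(sigma_total, sigma_o):
    """Conclusion b): sigma_f from total sigma and a known sigma_o."""
    var_f = sigma_total ** 2 - sigma_o ** 2
    if var_f < 0.0:
        raise ValueError("sigma_o exceeds the total sigma")
    return math.sqrt(var_f)


def second_obs_sigma(s1, s2, sigma_o1):
    """Conclusion c): sigma_o2 from sigma_o1 and the two totals.

    Subtracting the two equations removes the common forecast
    term: (sigma_o2)^2 = (sigma_o1)^2 + s2 - s1.
    """
    var_o2 = sigma_o1 ** 2 + (s2 - s1)
    if var_o2 < 0.0:
        raise ValueError("inputs imply a negative variance")
    return math.sqrt(var_o2)
```

A negative variance in either function signals
that the inputs are mutually inconsistent, for
instance because the assumption of a common
forecast error does not hold for the two data
sets.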
Figure 1. Histogram of O-F rawinsonde heights. Global. All levels. October 2003.
Number of observations = 457919. Yellow color corresponds to observations that were
marked as suspect by the background check but passed the adaptive buddy check (Dee et
al. 1999). The blue curve shows the Gaussian distribution for the same mean and
standard deviation.









[Figures 2-5: captions not recovered. Surviving fragment: "... of observations = 85776".]








Figure 6. As in Fig. 1 but for surface geopotential heights. Number of observations
= 463369.



Figure 7. As in Fig. 1 but for surface winds. Number of observations = 123254.




